fasttexture.txt
1999-05-25
Uhm... what about doing something good & fast !?
What can speed up a texture-mapped drawing module !?
1) Elimination of useless pixels: use scanline methods.
2) Longword writes to chunky buffer, if possible: use a tmp register.
3) Uhm... 3 tables are too many, and 6-bit precision is also too much, because
in a 256-color palette you will never find such exact matches... reduce!
4) So, less table-access!
5) More scanline precalcs.
All these have been tested on Escape, and were quite good.
1) Scanline methods.
Two methods:
a)- a MoreCrack.PEngY like (longword scanline buffer, scanlines,
broker routine, drawing from far to near, TRANSPARENCY PROBLEMS!);
b)- a dynamic method, based on a list. (maybe also with some dynamic
start-line pointers!?)
a) I think I'll go with method b instead. (Method a is good, but needs LOTS of mem! 4*chunkybuffer!!)
b) List memory: a fixed-size buffer, big enough to hold a good number of scanlines.
The size should also depend on the resolution. And CHECK the buffer
to avoid overflows. Maybe grow it dynamically!
Scanline struct:
   NEXT.l
  (PREC.l)    I think I can omit it... I hope so...
   FLAGS.b    For special scanlines: transparent !!
   HOLE00.b
   ID.w       Polygon ID.
   LEN.w      Scanline LEN.
The starting buffer may be filled with a scanline for every row: a single
scanline would overflow in LEN, and would be difficult to manage (there is
no y anywhere!).
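The list memory described above could be sketched in C roughly like this. It is only a sketch under assumptions: the struct follows the field list in these notes (without PREC), while MAX_ROWS, POOL_SIZE, SL_TRANSPARENT, sl_alloc and sl_init are made-up names and values. On a 68k the struct would be 10 bytes; on a modern host the pointer is wider.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SL_TRANSPARENT 0x01  /* hypothetical flag bit for FLAGS.b */
#define MAX_ROWS   256       /* hypothetical: should depend on the resolution */
#define POOL_SIZE  4096      /* hypothetical: "a good number of scanlines" */

typedef struct Scanline {
    struct Scanline *next;   /* NEXT.l */
    uint8_t  flags;          /* FLAGS.b: special scanlines (transparent) */
    uint8_t  hole00;         /* HOLE00.b: padding */
    uint16_t id;             /* ID.w: polygon ID */
    uint16_t len;            /* LEN.w: scanline length in pixels */
} Scanline;

typedef struct {
    Scanline  pool[POOL_SIZE];     /* fixed-size list memory */
    size_t    used;                /* entries handed out so far */
    Scanline *row_head[MAX_ROWS];  /* one start pointer per row */
} ScanlineList;

/* Take one entry from the pool; NULL means the buffer is full. */
Scanline *sl_alloc(ScanlineList *l)
{
    assert(l != NULL);
    if (l->used >= POOL_SIZE)
        return NULL;               /* the overflow CHECK */
    return &l->pool[l->used++];
}

/* Fill the starting buffer with one full-width scanline per row.
   Returns 1 on success, 0 if the rows do not fit. */
int sl_init(ScanlineList *l, int rows, uint16_t width, uint16_t bg_id)
{
    assert(l != NULL);
    if (rows < 0 || rows > MAX_ROWS)
        return 0;
    l->used = 0;
    for (int y = 0; y < rows; y++) {
        Scanline *s = sl_alloc(l);
        if (s == NULL)
            return 0;
        s->next = NULL;
        s->flags = 0;
        s->hole00 = 0;
        s->id = bg_id;
        s->len = width;
        l->row_head[y] = s;
    }
    return 1;
}
```

The row is implied by which row_head entry a list hangs from, so no y has to be stored in the scanline itself.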
The direction:
a) near to far;
b) far to near;
a) It's good because you do not have to add useless scanlines: if a scanline
is behind another, you can skip it immediately. With transparencies: they
are flagged, so it's easy to make a specific case for scanlines behind a
transparent one. Good also because you can manage hard cases: a transparent behind
a transparent: you get the case immediately, so you can decide whether to skip it or
what. It's bad for managing the background and the starting situation.
b) It's bad because you must draw even scanlines that will later be covered;
for transparents, it's a bit better, because once the transparent has covered all the
scanlines behind it, you have nothing more to do.
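To make the near-to-far case concrete, here is a small C sketch of the "skip it immediately" step: a new (farther) scanline is clipped against the nearer opaque scanlines already placed on the same row, and only the visible pieces come back. Span, clip_behind and the half-open [x0, x1) convention are assumptions for the example, not something from these notes. Transparent scanlines are left out of the cover list here, since they do not hide what is behind them.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int x0, x1;   /* half-open pixel range [x0, x1) on one row */
} Span;

/* Clip span s against nearer opaque spans.
   cover[] must be sorted by x0 and non-overlapping.
   Visible pieces are written to out[]; returns their count,
   0 if s is fully hidden, or -1 if out[] is too small. */
int clip_behind(const Span *cover, int ncover, Span s, Span *out, int max_out)
{
    int n = 0;
    int x = s.x0;             /* left edge of the part not yet decided */

    assert(s.x0 <= s.x1);
    assert(out != NULL);

    for (int i = 0; i < ncover && x < s.x1; i++) {
        if (cover[i].x1 <= x)
            continue;         /* cover lies entirely to the left */
        if (cover[i].x0 >= s.x1)
            break;            /* cover lies entirely to the right */
        if (cover[i].x0 > x) {            /* visible gap before this cover */
            if (n == max_out)
                return -1;
            out[n].x0 = x;
            out[n].x1 = cover[i].x0;
            n++;
        }
        x = cover[i].x1;      /* continue after the covered part */
    }
    if (x < s.x1) {           /* visible tail after the last cover */
        if (n == max_out)
            return -1;
        out[n].x0 = x;
        out[n].x1 = s.x1;
        n++;
    }
    return n;
}
```

A return value of 0 is the cheap case: the scanline is behind everything and never enters the list.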
2) A quite old idea, but never implemented, and not so easy: a tmp register, common
to all drawing routines (plain, textured, shaded, bright, burning,... so I can stop
whenever I want with one and start with another), used to write to mem. Check if it is
a good idea to complicate the source: I want a good performance gain to justify that !
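A minimal C sketch of the shared tmp register idea: pixels are shifted into one 32-bit value, and the chunky buffer only sees a write when four of them are collected. PixelWriter, pw_put, draw_plain and draw_textured are hypothetical names. The shift order assumes a big-endian machine like the 68k, where the first pixel ends up at the lowest address; a span that does not end on a longword boundary would still need its own flush, which is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t *dst;    /* next longword in the chunky buffer */
    uint32_t  tmp;    /* the shared "tmp register" */
    int       count;  /* pixels currently held in tmp (0..3) */
} PixelWriter;

/* Add one chunky pixel; write to memory only every 4th pixel. */
void pw_put(PixelWriter *w, uint8_t pixel)
{
    assert(w != NULL && w->dst != NULL);
    w->tmp = (w->tmp << 8) | pixel;
    if (++w->count == 4) {
        *w->dst++ = w->tmp;   /* one longword write for 4 pixels */
        w->count = 0;
    }
}

/* Two different routines using the same writer state, so one can
   stop in the middle of a longword and the other can carry on. */
void draw_plain(PixelWriter *w, uint8_t color, int n)
{
    while (n-- > 0)
        pw_put(w, color);
}

void draw_textured(PixelWriter *w, const uint8_t *texels, int n)
{
    for (int k = 0; k < n; k++)
        pw_put(w, texels[k]);
}
```

Example: 3 plain pixels followed by 5 textured ones give exactly two longword writes, even though neither run is a multiple of 4.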
3/4) Table reduction:
In a 256-color palette, I think 4-bit precision per component is good enough.
To reduce the number of accesses: I can use 2 tables instead of 3 (1 table is
far too big: 1MB!). One table would correct the R, and another the GB. (so, 68k in total)
Correction tables: Light scaling table (cuts the light)
Add table (adds the components to the color)
Better: A Table working on R|GB to .... (working on PAL not on R|GB,something? think!)
And what about R|G|B tables, to allow more tables -> more effects, like an alpha channel?
(imagine R|G|B tables to cut light with an alpha channel: for lights it is not useful,
but I would be able to make textures with an alpha channel!)
Needed: (4k*3)*alphashades.
So: 12k*32 alphashades =384k ! not too much !
Using 16 shades: 192 k !! better !!
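The sizes above can be checked with a few lines of C. This is one reading of the numbers, not something the notes state outright: it assumes every table has one byte per entry and is indexed by the color bits plus an 8-bit second value (for example a light level), which is what makes 1MB, 68k and 4k come out. table_bytes and build_light_table are hypothetical names, and the scaling formula is just one plausible way to "cut the light".

```c
#include <assert.h>
#include <stdint.h>

/* Bytes for a table indexed by color_bits of color and level_bits of
   a second value, one byte per entry (assumed layout). */
unsigned long table_bytes(unsigned color_bits, unsigned level_bits)
{
    assert(color_bits + level_bits < 32);
    return 1UL << (color_bits + level_bits);
}

/* A possible light scaling table for one 4-bit component:
   t[level][c] is component c (0..15) scaled by level/255, rounded. */
void build_light_table(uint8_t t[256][16])
{
    for (int level = 0; level < 256; level++)
        for (int c = 0; c < 16; c++)
            t[level][c] = (uint8_t)((c * level + 127) / 255);
}

/* Memory for separate R, G and B tables, repeated per alpha shade. */
unsigned long alpha_tables_bytes(unsigned shades)
{
    return 3UL * table_bytes(4, 8) * shades;
}
```

With these assumptions: one R|G|B table is table_bytes(12, 8) = 1MB, R plus GB is table_bytes(4, 8) + table_bytes(8, 8) = 4k + 64k = 68k, and alpha_tables_bytes(32) and alpha_tables_bytes(16) give the 384k and 192k figures.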
5) While writing Sbima, I saw that I need a LOT of mem reads/writes to update
the steps, and I have to load them every time.
See the needed r/w for every x,tx,ty,i,.. update:
  r: the y
  r: the cy
  cmp them
  r: the value
  if to update
    r: the step
    w: the new value
  r: the changey
  cmp with cy
  if to update
    r: the newstep
    w: the newstep over the oldstep
                                      (average: 5r 1w per scanline)
                                      (max:     6r 2w per scanline)
With a precalculated expanded scanline (also including the inits and steps):
- precalc cycle: (done polygon by polygon)
  (h=have in registers)
  h: y (=cy)
  h: the value
  h: the step
  w: the value
  h: changey
  if to update
    r: the newstep
- draw cycle:
  r: the value
                                      (average: 1r 1w per scanline)
                                      (max:     2r 1w per scanline)
The extra memory needed is not much: 10 bytes max per scanline (x.l tx.w ty.w i.w)
Maybe even less (x.w tx.b ty.b i.b)
More advantages: the precalc loop may stay in cache (256 bytes) and pipeline better.
EVEN BETTER: Do that in the ScanlineMaker loop !? The first ones done ARE SURE TO BE
USED! So, no useless calculations!!
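The precalc cycle could look something like this in C. It is a sketch with invented names (ScanInit, EdgeSeg, precalc_edge, row_start_x) and an assumed 16.16 fixed-point x: the value and the step live in locals for the whole polygon edge, each row gets one write, and the new step is read only when the edge reaches its changey. The field sizes follow the x.l tx.w ty.w i.w layout above (10 bytes of data, though a C compiler will usually pad the struct to 12).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int32_t x;    /* x.l, 16.16 fixed point (assumed format) */
    int16_t tx;   /* tx.w */
    int16_t ty;   /* ty.w */
    int16_t i;    /* i.w */
} ScanInit;

typedef struct {
    int      y_end;  /* first row NOT in this segment (the "changey") */
    ScanInit step;   /* per-row step for every field */
} EdgeSeg;

/* Precalc cycle: expand one polygon edge into per-row start values.
   Returns the number of rows written, or -1 if out[] is too small. */
int precalc_edge(ScanInit *out, int max_rows, int y0, ScanInit cur,
                 const EdgeSeg *segs, int nsegs)
{
    int n = 0;
    int y = y0;

    assert(out != NULL && segs != NULL);
    for (int s = 0; s < nsegs; s++) {
        ScanInit st = segs[s].step;       /* r: the newstep, once per segment */
        for (; y < segs[s].y_end; y++) {
            if (n == max_rows)
                return -1;
            out[n++] = cur;               /* w: the value */
            cur.x += st.x;                /* value and step stay in locals */
            cur.tx = (int16_t)(cur.tx + st.tx);
            cur.ty = (int16_t)(cur.ty + st.ty);
            cur.i  = (int16_t)(cur.i + st.i);
        }
    }
    return n;
}

/* Draw cycle side: a single read per row gives the ready-made start value. */
int row_start_x(const ScanInit *row)
{
    return (int)(row->x / 65536);
}
```

The draw cycle then only walks the out[] array row by row, with no compare against cy or changey at all.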